Chinese Comma Disambiguation for Discourse Analysis
Abstract
The Chinese comma signals the boundary of discourse units and also anchors discourse relations between adjacent text spans. In this work, we propose a discourse structure-oriented classification of the comma that can be automatically extracted from the Chinese Treebank based on syntactic patterns. We then experimented with two supervised learning methods that automatically disambiguate the Chinese comma based on this classification. The first method integrates comma classification into parsing, and the second method adopts a “post-processing” approach that extracts features from automatic parses to train a classifier. The experimental results show that the second approach compares favorably against the first approach.
Similar Resources
Maximum Entropy for Chinese Comma Classification with Rich Linguistic Features
Discourse relation is an important content of discourse semantic analysis, and the study of punctuation is of importance for discourse relation. In this paper, we propose a method of Chinese comma classification based on maximum entropy (ME). This method classifies the sentence relation based on comma with ME by extracting rich linguistic features before and after the commas in sentences. Exper...
A Clause-Level Hybrid Approach to Chinese Empty Element Recovery
Empty elements (EEs) play a critical role in Chinese syntactic, semantic and discourse analysis. Previous studies employ a language-independent sentence-level approach to EE recovery, by casting it as a linear tagging or structured parsing problem. In comparison, this paper proposes a clause-level hybrid approach to address specific problems in Chinese EE recovery, which recovers EEs in Chinese ...
Detection, Disambiguation and Argument Identification of Discourse Connectives in Chinese Discourse Parsing
In this paper, we investigate four important issues together for explicit discourse relation labelling in Chinese texts: (1) discourse connective extraction, (2) linking ambiguity resolution, (3) relation type disambiguation, and (4) argument boundary identification. In a pipelined Chinese discourse parser, we identify potential connective candidates by string matching, eliminate non-discourse ...
Punctuation, Prosody, and Discourse: Afterthought Vs. Right Dislocation
In a reading production experiment we investigate the impact of punctuation and discourse structure on the prosodic differentiation of right dislocation (RD) and afterthought (AT). Both discourse structure and punctuation are likely to affect the prosodic marking of these right-peripheral constructions, as certain prosodic markings are appropriate only in certain discourse structures, and punct...
Publication date: 2012